Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Here's a PR for the work I did on making a transactional ART. I hope the info below is helpful. I'll leave it to you to merge this code/portions of the code however you would like.
Main modifications:
New
ART/directoryClass
art_indexinDB_index.hhunit-artindex.cctest intest/How to use
If you'd like to test it I believe that the
YCSB_bench.hhfile currently sets the ART as the data structure to test.The
art_indexdefined and implemented inDB_index.hhprovides the same interfaceto the transactional ART that
ordered_indexdoes for the transactional masstree.In other words,
art_indeximplements:select_rowinsert_rowupdate_rowdelete_rownontrans_getnontrans_putAll these functions are implemented for values with column accesses as well.
Range scan has been implemented on a separate branch in zyedidia/sto and seems to have some issues
in the TPCC benchmark regarding endianness. I think these issues have to do with
me misunderstanding the specification rather than a bug. Feel free to take a look at the code there.
Implementation
There are two elements to the implementation: first the underlying ART implementation which is just an implementation of an ART with optimistic lock-coupling provided by the makers of the ART. This has been modified some to reveal some more information so that the second element which is the transactional wrapper can access some needed information like the children of non-leaf nodes, and whether or not a node was upgraded during an insert.
Underlying ART
The main ART lives in
ART/. This code is modified from the code for the ART writtenby the authors of the original paper. The original source code can be found at:
https://github.com/flode/ARTSynchronized. That github repository implements a
serial ART, an optimistic lock coupling (OLC) ART, and a ROWEX locking ART. Our
ART/directory contains an adapted version of the OLC ART.
Our modifications consist mainly of exposing internal ART nodes in calls to insert,
lookup, and delete, as well as adding a version number to every node in the ART.
The ART implementation stores TIDs (
uint64_ts) using theKeytype defined inKey.h.This is simply a 128-byte buffer which uses the heap if the key exceeds 128 bytes.
The
loadKeyfunction must also be defined. This function takes a TID and an emptykey and populates the key with the actual key for that TID. This means that every
value must store its key as well.
I store values in
art_indexasinternal_elems which contain thekeyfield, storingan
lcdf::Strobject. Seeart_index.load_keywhich populates the passed in referenceto an ART key with the
lcdf::Strdata.Insert
The
insertfunction has been modified to take amake_tidcallback which will allocatea new user element element and return the TID for the ART to store. This function may
not be called on every insert. It will not be called if the key already exists in the
tree.
Insert returns 3 values as a tuple: (tid, parent, old). The tid value is the TID of
the node at that key. If the key was just inserted, tid will correspond to the new tid.
If the key already existed, the tid of the key is returned. Parent is the node struct
corresponding to the parent of the inserted node. It is null if the key already existed
before the insert. Old is an obsolete node that should be deleted. This is usually null
but if the insert caused a node upgrade then that node should be deleted but some
extra precautions need to be made.
Lookup
ART's lookup takes a standard ART key and returns the TID (if found, 0 otherwise), and
the TID's parent node. If the key was not found then it returns the node at the furthest
point in the traversal where it determined the key was not in the tree.
Remove
Remove takes an ART key and the TID to remove and returns a possible old node. This
old node may be null but if it has a value then it was made obsolete during a shrink
and should be deleted.
RCU
I removed all "Epoch" RCU calls in the code and replaced them with calls that return
nodes to be deleted to the
art_indexto rcu delete them using STO's mechanism. Ibelieve there may be still a little work to be done regarding RCU deletion on destruction
of the entire ART.
ART index
I simply took the
ordered_indexdata structure which wraps masstree and replacedthe masstree calls with ART calls.
On top of modifying the existing ordered index structure to use the ART instead of masstree, some code has also been added to deal with node upgrading in the ART.
Tests
The
art_indexpasses the unit tests intest/unit-dboindex.ccbut those testsare not especially rigorous. I wrote some more tests targeting specific functionality
in the ART (for example, making sure node upgrading aborts properly). These new tests
are located in
test/unit-artindex.cc. The tests shouldn't be implementation specificand the file also contains a masstree wrapper so the tests can be executed on masstree
as well (though the tests specifically test issues that would occur in ART rather than
masstree).
Benchmarks
The ART implementation outperforms masstree in some of the benchmarks I ran, including YCSB. The ART range scan isn't faster than masstree's however.
Extra stuff
A lot of time during the summer was spent working on the
TARTwhich can be foundin
datatype/TART.hh. This is an implementation of a transactional ART that I wrotefrom scratch but was unable to port to the interface required by the benchmarks
(the interfaced defined by
ordered_index). Once I had a more in-depth understandingof how a TART might be written and how Masstree worked, I was able to take
ordered_indexand modify the references to masstree to instead call to the modified underlyingART. That code is included in this PR but at this point is unused.